Image adjustment

ABSTRACT

A method for adjusting an image using message passing comprises associating each pixel of an image with a node of a graph and one or more cliques of nodes, determining for a node of the graph a respective set of possible pixel labels for which a unary potential is known, computing for that node a unary potential of a possible pixel label for which the unary potential is unknown, adjusting a clique potential associated with each clique to which that node belongs based on the unary potentials, and adjusting, based on the adjusted clique potential associated with each clique to which that node belongs, at least one of the messages between that node and the other nodes of each clique. Once a convergence criterion is met, an adjusted image is produced having pixel labels determined from the adjusted messages.

FIELD

This disclosure relates to image processing. In particular, but without limitation, this disclosure relates to a method of adjusting an image using message passing.

BACKGROUND

As computer vision and image processing systems become more complex, it is increasingly important to build models in a way that makes it possible to manage this complexity.

Maximum a-posteriori (MAP) inference in graphical models, and especially in random fields defined over image domains, is one of the most useful tools in computer vision and related fields. If all potentials defining the objective are of parametric shape, then in certain cases non-linear optimization is the method of choice for best efficiency. On the other hand, if the potentials are not of a parametric shape, then methods such as loopy belief propagation (BP) or its convex variants are the method of choice. BP and related algorithms face two limitations if the state space is large: first, the intrinsic message passing step requires at least linear time in terms of the state space size, and it is superlinear in general. Thus, the runtime of these methods does not scale well with the state space size. Second, the memory consumption grows linearly with the state space size, since belief propagation requires the maintenance of messages for each state.

If the state space is huge, then even optimizing non-parametric unary potentials (usually referred to as data terms) by explicit enumeration may be computationally too expensive for many applications (e.g. when implemented on embedded devices). Certain data terms allow more efficient computation via integral images or running sums, and data terms may not need to be computed to full precision, but these methods are only suitable for very specific problem instances.

The present invention seeks to provide improved methods and systems for adjusting an image.

SUMMARY

Aspects and features of an invention are set out in the appended claims.

BRIEF DESCRIPTION OF THE DRAWINGS

The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.

Illustrative embodiments of the present disclosure will now be described, by way of example only, with reference to the drawings. In the drawings:

FIG. 1 shows the architecture of an example apparatus or device;

FIGS. 2a, 2b and 2c show example graphs comprising four nodes;

FIG. 3 shows the evolution of dual energies with respect to the number of passes of the image for a dense stereo correspondence problem;

FIGS. 4a and 4b show converged disparity maps;

FIG. 5 is a graphical plot showing the primal energy evolution with respect to number of passes over an image using different traversal schedules and weights according to a method disclosed herein;

FIG. 6 is a flowchart illustrating a method of adjusting an image;

FIG. 7 is a flowchart illustrating an optional process that may be integrated with the process of FIG. 6;

FIG. 8 shows graphical plots depicting the evolution of primal energies for dense disparity estimation with respect to wall time;

FIG. 9 is a visual illustration of converging stereo results;

FIG. 10 shows the evolution of primal energies for dense optical flow;

FIG. 11 is a visual illustration of the convergence of the optical flow field for the “Schefflera” dataset; and

FIG. 12 is a visual illustration of converging optical flow fields for various datasets.

Throughout the description and the drawings, like reference numerals refer to like parts.

DETAILED DESCRIPTION

A computer implemented method for producing an adjusted image using message passing is provided. The method comprises receiving an input image that comprises a plurality of pixels. Each pixel has an input value and each input value corresponds to one of a number of possible pixel labels. The method further comprises associating each pixel with a node of a graph and one or more cliques of nodes. Each possible pixel label is associated with a unary potential, and, for each permutation of possible pixel labels of the pixels associated with the nodes of each clique, a clique potential is associated. The method further comprises, for each clique, initialising at least one message between each pair of nodes of that clique.

The method further comprises, for each node, a) determining, for that node, a respective set of possible pixel labels for which the unary potential is known, b) computing a unary potential of a selected possible pixel label for which the unary potential is unknown and updating the respective set of possible pixel labels to include the selected possible pixel label, c) for each clique to which that node belongs, based on the unary potentials associated with the updated respective set of possible pixel labels, adjusting the clique potential associated with that clique, and, d) for each clique to which that node belongs, adjusting, based on the adjusted clique potential associated with that clique, at least one of the messages between that node and the other nodes of that clique. For each node, steps a)-d) are repeated until a convergence criterion is met.

The method further comprises producing an adjusted image having pixel labels determined from the adjusted messages.

A unary potential is a mapping from a pair to a real value, the pair comprising a pixel and a label for that pixel. This mapping or function may be based on the image values near the pixel of interest. A unary potential can be thought of as a cost associated with assigning a pixel label or state or value to a pixel (node). A clique potential of a clique is a cost associated with assigning pixel labels to nodes of that clique. Accordingly, in an optimisation problem in image processing, the unary potentials and clique potentials may act as a measure of how close a solution of the problem is to an optimal solution. One may therefore attempt to find an optimal solution by finding a minimum sum of unary potentials and clique potentials, i.e. the assignment of pixel labels to pixels for which there is a minimal cost.
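The cost of one candidate labelling can be sketched as follows. This is a minimal illustration only: the function name `total_energy`, the dictionary layout, the two-pixel data and the Potts-style clique cost `potts` are assumptions made for the example and are not taken from the disclosure.

```python
def total_energy(labels, unary, cliques, clique_potential):
    """Sum of unary and clique costs for one labelling.

    labels: dict mapping node -> assigned label
    unary: dict mapping node -> {label: cost}
    cliques: list of tuples of nodes
    clique_potential: function taking the tuple of labels of a clique
    """
    energy = sum(unary[s][labels[s]] for s in labels)
    for clique in cliques:
        energy += clique_potential(tuple(labels[s] for s in clique))
    return energy


def potts(clique_labels):
    """Hypothetical clique cost: 0 if all labels agree, 1 otherwise."""
    return 0.0 if len(set(clique_labels)) == 1 else 1.0


# Hypothetical two-pixel image with labels {0, 1}.
unary = {"A": {0: 0.0, 1: 2.0}, "B": {0: 2.0, 1: 0.5}}
cliques = [("A", "B")]

energy_same = total_energy({"A": 0, "B": 0}, unary, cliques, potts)  # 0 + 2 + 0
energy_diff = total_energy({"A": 0, "B": 1}, unary, cliques, potts)  # 0 + 0.5 + 1
```

In this toy case the labelling with different labels has the lower total cost, so it would be preferred even though it pays the clique penalty.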

A unary potential may be defined based on the image processing problem to be solved. For example, in some circumstances the unary potentials may all comprise constant values for a respective possible pixel label. In other circumstances, the unary potential may be defined by a polynomial function. Any suitable unary potential may be used and examples will be clear from the examples below.

In some situations, such as when the pixel labels are to be used to identify disparities or flow vectors, the unary potential for a particular pixel is a function comparing an image patch centred at that particular pixel in the reference image with an image patch from a second (“moving”) image centred at the particular pixel shifted by the label value.
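A patch-comparison unary potential of this kind might look like the sketch below. The sum of absolute differences, the horizontal-only shift, the sign convention of the shift, the patch radius and the name `patch_cost` are all assumptions chosen for illustration; the disclosure does not fix a particular comparison function.

```python
def patch_cost(reference, moving, row, col, disparity, radius=1):
    """Sum of absolute differences between the patch centred at (row, col)
    in the reference image and the patch centred at (row, col - disparity)
    in the moving image. Pixels falling outside either image are skipped."""
    height, width = len(reference), len(reference[0])
    cost = 0.0
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            r, c = row + dr, col + dc
            shifted = c - disparity
            if 0 <= r < height and 0 <= c < width and 0 <= shifted < width:
                cost += abs(reference[r][c] - moving[r][shifted])
    return cost


# Hypothetical data: the moving image is the reference shifted left by one pixel.
base = [5, 9, 2, 7, 1, 8]
reference = [list(base) for _ in range(3)]
moving = [base[1:] + [0] for _ in range(3)]

cost_at_true_shift = patch_cost(reference, moving, 1, 3, disparity=1)
cost_at_zero_shift = patch_cost(reference, moving, 1, 3, disparity=0)
```

The label whose shift aligns the two patches receives the lowest cost, which is what makes the function usable as a unary potential for disparity labels.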

In some situations, such as when the pixel labels are to be used to identify an object category, the unary potential is a function that has the image patch centred at the particular pixel and an object category as arguments. The unary potential would therefore represent a cost associated with assigning the pixel to an object category. This function can be trained beforehand using, for example, machine learning techniques.

The methods disclosed herein demonstrate a memory-efficient and fast way of performing image processing operations such as object segmentation, distortion correction, blurring and zooming. By calculating a unary potential for a selected possible pixel label in each iteration of steps a) to d), not all of the unary potentials necessarily need to be calculated for the method to converge on a solution to an image processing problem. Accordingly, the method requires less memory to be able to converge on a solution to an image processing problem, while maintaining a high level of accuracy.

The method may further comprise, after step a) and before step c), updating the respective set of possible pixel labels by removing one of the possible pixel labels therefrom. Optionally, for each clique to which the node belongs, the method uses an associated parameter related to messages passed between the node and other nodes of that clique. The parameter can be dependent on the possible pixel labels for that node. The one of the possible pixel labels that is removed from the respective set of possible pixel labels may be the possible pixel label of the respective set of possible pixel labels for which the sum of the associated parameters is greatest.

At step c), adjusting the clique potential associated with that clique may comprise adjusting the clique potential based on a weighted sum of the unary potentials associated with the respective set of possible pixel labels. The weighting may be an even or uneven weighting. Adjusting the clique potential associated with that clique may further comprise adding a constant value to the weighted sum for each possible pixel label which is not included in the respective set of possible pixel labels.

At step b), the selected possible pixel label may be selected from a candidate set of possible pixel labels. The candidate set of possible pixel labels may be based on an estimated set of possible pixel labels of an adjacent node for which the unary potential is known.

Repeating steps a)-d) for each node until a convergence criterion is met may comprise repeating steps a)-d) a predetermined number of times. Repeating steps a)-d) for each node until a convergence criterion is met may comprise repeating steps a)-d) until repetition does not change the messages.

At step b), computing a unary potential of a selected possible pixel label may comprise estimating a unary potential of the selected possible pixel label based on a unary potential associated with a possible pixel label of an adjacent node in the graph. For example, if a possible pixel label of an adjacent node is associated with a particular unary potential (for example a constant value such as 1), then computing the unary potential of the selected possible pixel label may comprise assigning the particular unary potential (the constant value) to the selected possible pixel label. Such an assignment may indicate that adjacent nodes are likely to take the same value, for example by indicating that two adjacent pixels are associated with the same surface depicted in an image.

At step d), adjusting at least one of the messages may comprise adjusting the at least one of the messages according to one or more predetermined message weightings. For example, the messages may be evenly weighted. The messages may be unevenly weighted.

The unary potentials may be configured so as to cause the method to perform a segmentation operation upon the input image. The unary potentials may be configured so as to cause the method to perform a distortion correction operation upon the input image. The unary potentials may be configured so as to cause the method to perform an optical flow operation upon the input image. The unary potentials may be configured so as to cause the method to perform a dense stereo correspondence operation upon the input image. The unary potentials may be configured so as to cause the method to perform an image denoising operation upon the input image. The unary potentials may be configured so as to cause the method to perform a deblurring and/or zooming operation upon the input image. Furthermore the clique potentials may be initially configured so as to cause the method to perform one or more of said operations upon the input image.

An apparatus or system is provided, the apparatus or system arranged to perform a method as disclosed herein. The apparatus or system may comprise input/output means arranged to receive an image. The apparatus or system may comprise a memory storing instructions for causing a processor to perform steps of a method as disclosed herein. The apparatus or system may comprise a processor for performing steps of a method as disclosed herein.

A non-transitory computer-readable medium is provided. The computer-readable medium comprises machine-readable instructions arranged, upon execution by one or more processors, to cause the one or more processors to carry out a method as disclosed herein.

Further optional features will be appreciated from the following description.

FIG. 1 shows the architecture of an example apparatus or device 100 for performing the methods described herein. The apparatus or device 100 comprises a processor 110, a memory 115, and a display 135. These are connected to an optional central bus structure, the display 135 being connected via a display adaptor 130. The example apparatus or device 100 also comprises an input device 125 (such as a mouse and/or keyboard) and a communications adaptor 105 for connecting the apparatus or device to other apparatuses, devices or networks. The input device 125 and communications adaptor 105 are also connected to the central bus structure, the input device 125 being connected via an input device adaptor 120. The example apparatus or device 100 also comprises an output device 145 (such as a supplementary display device and/or a sound system). The output device 145 is connected to the central bus structure via an output device adaptor 140.

In operation the processor 110 can execute computer-executable instructions stored in the memory 115 and the results of the processing can be displayed to a user on the display 135. User inputs for controlling the operation of the computer may be received via input device(s) 125. Results of at least part of the processing can also be outputted via output device(s) 145.

MAP Inference and Graphical Models

One of the first steps in creating a computer vision system is the establishment of the overall computational paradigm that will be used to compute the final solution. One of the most flexible ways to implement a solution is through the combination of an energy function and maximum a-posteriori (MAP) inference.

The MAP inference strategy begins with the definition of a conditional probability distribution p(X|Y) where X is a vector of random variables estimated from observations Y. In MAP inference, the actual estimate X* is found by finding the vector X* that maximises p(X|Y).

The connection with energy functions can be seen by expressing p(X|Y) as a Gibbs distribution:

${p( X \middle| Y )} = {\frac{1}{Z}{\exp( {- {\sum\limits_{C}{E( {X_{C};Y} )}}} )}}$

where E(X_(C);Y) denotes an energy function over a set X_(C) of elements of X. Accordingly, the sum in the exponent is a sum over different sets of elements of X. The structure of these sets is characteristic of the model used, as will be discussed below.

The constant Z is a normalisation constant that ensures that p(X|Y) is a valid probability distribution, and accordingly is not usually important for finding X*. Because p(X|Y) decreases as the energy increases, it can be seen that to perform MAP inference, one must find the vector X* which minimises the energy function

${E( {X;Y} )} = {\sum\limits_{C}{{E( {X_{C};Y} )}.}}$

The next step in MAP inference for image adjustment is to decide the form of the distribution p(X|Y), for which it is useful to use a graphical model.

A graph G is defined by a pair of sets, G=(ν,ε), with nodes, or vertices, s belonging to the set ν of vertices and edges e belonging to the set ε of edges. Each pixel of a received image may be represented as a node, or vertex, on a graph. Relationships between different pixels of the received image may be represented by edges between nodes on the graph.

For example, assume that the vector X represents an image comprising 4 pixels, with each pixel able to take on one of 256 values or labels. If one were to specify just p(X) and were to ignore the vector Y describing observations then one would need to determine 256⁴ labels in order to account for every possible interaction between pixels. This is shown in FIG. 2A, which shows a graph comprising four nodes, each node representing a pixel of a received image. In the graph, node A is positioned next to node B and node C is positioned next to node D. Nodes A and B are positioned above nodes C and D respectively. As there are no edges in the graph of FIG. 2A, no interrelationships between pixels are accounted for or modelled.

If instead one were to model some of the interactions between nodes, for example the interactions between horizontal and vertical neighbours, then the number of required labels is reduced. In FIG. 2B, the interactions between node pairs A & B, A & C, C & D, and B & D are modelled and so only 4×256² labels need be considered to specify the distribution. The reduction in required labels is due to the fact that relationships between diagonal neighbours are captured indirectly by horizontal and vertical neighbours. For example, the interaction between B & C is not specified (there is no edge between nodes B & C on the graph of FIG. 2B) but the interaction between B & C is indirectly captured via the interactions between node pairs A & C and A & B, and node pairs C & D and B & D.
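The size of the reduction can be checked with a short calculation. The variable names below are illustrative only.

```python
num_labels = 256   # values each pixel may take
num_pixels = 4     # pixels A, B, C and D
num_edges = 4      # A-B, A-C, C-D and B-D

# One entry per joint configuration of all four pixels.
full_joint_entries = num_labels ** num_pixels

# One table of 256 x 256 entries per modelled pair of neighbours.
pairwise_entries = num_edges * num_labels ** 2

reduction_factor = full_joint_entries // pairwise_entries
```

For this four-pixel example the pairwise model needs 262,144 entries instead of 4,294,967,296, a reduction by a factor of 16,384.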

FIG. 2C shows another graph of four pixels A, B, C & D, in which the interactions between all nodes are explicitly modelled by edges. The model in FIG. 2C is more descriptive than the model in FIG. 2B.

A clique is a subset of nodes (vertices) of a graph that are completely connected, i.e. every distinct node of a clique is adjacent to every other distinct node of the clique. For example, in FIG. 2B, node A belongs to a clique comprising nodes A & B as nodes A and B are adjacent. Node A also belongs to a clique comprising nodes A & C as nodes A and C are adjacent. However, in FIG. 2B, there is no clique comprising nodes A & D as nodes A & D are not adjacent (they are not connected by an edge).

In FIG. 2C, node A belongs to a clique comprising nodes A & B and a clique comprising nodes A & C. However, in FIG. 2C, node A also belongs to a clique comprising nodes A & D as nodes A & D are adjacent (there is an edge connecting nodes A & D). Additionally, node A belongs to a clique comprising nodes A & B & C, a clique comprising nodes A & B & D, a clique comprising nodes A & C & D, and a clique comprising nodes A & B & C & D.
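The cliques listed for node A in the two graphs can be reproduced by brute-force enumeration. The sketch below is only practical for very small graphs, and the helper name `cliques_containing` and the edge lists are assumptions written to mirror FIGS. 2B and 2C.

```python
from itertools import combinations


def cliques_containing(node, nodes, edges):
    """Return every subset of two or more nodes that contains `node`
    and in which each pair of nodes is joined by an edge."""
    edge_set = {frozenset(edge) for edge in edges}
    found = []
    for size in range(2, len(nodes) + 1):
        for subset in combinations(nodes, size):
            if node in subset and all(
                frozenset(pair) in edge_set for pair in combinations(subset, 2)
            ):
                found.append(subset)
    return found


nodes = ["A", "B", "C", "D"]
edges_2b = [("A", "B"), ("A", "C"), ("C", "D"), ("B", "D")]  # grid neighbours
edges_2c = list(combinations(nodes, 2))                      # fully connected

cliques_2b = cliques_containing("A", nodes, edges_2b)
cliques_2c = cliques_containing("A", nodes, edges_2c)
```

For the grid graph node A belongs to two cliques, and for the fully connected graph it belongs to seven, matching the enumeration in the text.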

The cliques loosely capture the direct interactions between nodes. For example, if node A of FIG. 2B is assigned a particular value or label, then such an assignment would have a direct effect on node B and node C. Accordingly, cliques are related to the model used for modelling the computer vision problem.

In view of the above, a labelling or MAP inference problem is determining the optimal label x_(s)∈𝒳_(s) assigned at each node s∈ν, where the objective is over unary terms and clique terms (where cliques are indexed by Greek letters α etc.),

$x^{*} \overset{\text{def}}{=} \arg\min_{x} \sum_{s} \theta_{s}(x_{s}) + \sum_{\alpha} \theta_{\alpha}(x_{\alpha}) \qquad (\text{Eq. 1})$

where x=(x_(s))_(s∈ν)∈∏_(s∈ν)𝒳_(s), and x_(α) is the subvector of x comprising the labels assigned at nodes belonging to the clique α, x_(α)=(x_(s))_(s∈α).

The symbol θ_(s)(x_(s)) represents a unary potential, or cost, associated with assigning the label x_(s) to the node s. In other words, the potential θ_(s)(x_(s)) is the “energy” associated with assigning the label, or state, x_(s) to node s. The symbol θ_(α)(x_(α)) represents a clique potential, or cost, associated with assigning particular labels, or states, to the nodes of clique α. In other words, the potential θ_(α)(x_(α)) is the “energy” associated with assigning labels to the nodes of clique α and can be thought of as an energy term associated with interactions between nodes.
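For a graph of only a few nodes and labels, the minimisation over unary and clique potentials can be solved by exhaustive search, which makes the objective concrete. The sketch below is not the method of this disclosure; it is a reference baseline, and the function name `brute_force_map`, the Potts-style clique cost and all numerical values are assumptions.

```python
from itertools import product


def brute_force_map(nodes, label_sets, unary, cliques, clique_potential):
    """Exhaustively minimise the sum of unary and clique potentials."""
    best_labels, best_energy = None, float("inf")
    for assignment in product(*(label_sets[s] for s in nodes)):
        labels = dict(zip(nodes, assignment))
        energy = sum(unary[s][labels[s]] for s in nodes)
        energy += sum(
            clique_potential(tuple(labels[s] for s in clique)) for clique in cliques
        )
        if energy < best_energy:
            best_labels, best_energy = labels, energy
    return best_labels, best_energy


def potts(clique_labels):
    """Hypothetical clique potential: 0 if the labels agree, 1 otherwise."""
    return 0.0 if len(set(clique_labels)) == 1 else 1.0


nodes = ["A", "B", "C", "D"]
label_sets = {s: [0, 1] for s in nodes}
unary = {
    "A": {0: 0.0, 1: 3.0},
    "B": {0: 1.0, 1: 0.0},
    "C": {0: 0.0, 1: 1.0},
    "D": {0: 0.0, 1: 0.2},
}
cliques = [("A", "B"), ("A", "C"), ("C", "D"), ("B", "D")]

map_labels, map_energy = brute_force_map(nodes, label_sets, unary, cliques, potts)
```

Here node B on its own would prefer label 1, but the clique costs of disagreeing with its neighbours outweigh that preference. The search visits every joint labelling, so its cost grows exponentially with the number of pixels, which is why approximate schemes are needed for real images.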

The label assignment problem of (Eq. 1) is generally intractable to solve, and one highly successful approach to approximately solve this problem is to employ the corresponding linear programming (LP) relaxation,

$\begin{aligned} E_{MAP}(b) &\overset{\text{def}}{=} \sum_{s, x_{s}} \theta_{s}(x_{s})\, b_{s}(x_{s}) + \sum_{\alpha, x_{\alpha}} \theta_{\alpha}(x_{\alpha})\, b_{\alpha}(x_{\alpha}) \\ \text{s.t.}\quad b_{s}(x_{s}) &= \sum_{x_{\alpha} \backslash x_{s}} b_{\alpha}(x_{\alpha}), \qquad \sum_{x_{s}} b_{s}(x_{s}) = 1 \end{aligned} \qquad (\text{Eq. 2})$

where E_(MAP)(b) is the energy function to be minimised. In (Eq. 2), b_(α)(x_(α))≧0. The expression x_(α)\x_(s) is shorthand for {x_(α)′: x_(s)′=x_(s)}. In what follows α∋s is written instead of {α:s∈α}. The unknowns {b_(s)}_(s∈ν) and {b_(α)}_(α) are “one-hot” encodings of the assigned labels, e.g. if b* is the optimal solution of E_(MAP) and the relaxation is tight, then b_(s)(x_(s)) is ideally 1 if and only if state x_(s) is the optimal label at node s and 0 otherwise (similar for clique states x_(α)). The first set of constraints is usually called the set of marginalization constraints, and the unit sum constraint is typically referred to as a normalization constraint. The linear program in (Eq. 2) is not unique, since redundant non-negativity and normalization constraints can be added to E_(MAP) without affecting the optimal solution or value. Consequently, different duals are solved in the literature. The particular LP dual of E_(MAP), which will be used in the remainder, is given by

$\begin{aligned} E_{MAP}^{*}(\lambda) &= \sum_{s} \rho_{s} \\ \text{s.t.}\quad \forall x_{s}:\ \rho_{s} &= \theta_{s}(x_{s}) + \sum_{\alpha \ni s} \lambda_{\alpha \to s}(x_{s}) \\ \forall x_{\alpha}:\ \sum_{s \in \alpha} \lambda_{\alpha \to s}(x_{s}) &\leq \theta_{\alpha}(x_{\alpha}) \end{aligned} \qquad (\text{Eq. 3})$

The first set of constraints will be referred to as the balance constraints and the second set of constraints is termed capacity constraints. Since the unknowns (ρ_(s))_(s∈ν) play only the role of auxiliary variables, they are dropped as an argument to E_(MAP)* to simplify the notation. Without loss of generality one requires θ≧0 (pointwise) such that λ≡0 is dual feasible.

Even though (Eq. 3) is a convex problem (a linear program), optimizing E_(MAP)* is not straightforward. Generic LP codes do not exploit the very particular structure of the problem, and first order methods exhibit slow convergence in practice due to the non-smooth objective. A successful class of algorithms to solve E_(MAP)* approximately is based on block coordinate ascent, which performs repeated optimization over small but varying subsets of unknowns. Different algorithms are obtained by different choices of dual energies and subsets of optimized unknowns. One important aspect for the success of these algorithms is that the subproblems can be solved efficiently, for example, in closed form. These algorithms usually resemble the classical belief propagation algorithm (which has few guarantees if run on cyclic graphs) and fall under the umbrella term convex belief propagation. These algorithms have in common that the dual objective improves monotonically in each iteration, which renders them convergent (under the mild assumption that the optimal value of E_(MAP)* is finite, i.e. the problem is bounded).

Dual Coordinate Ascent

In this section a convex belief propagation algorithm is provided which optimizes over all variables λ_(α→s) for all cliques α containing s in each step (i.e. all messages incoming at node s). The convex belief propagation described in this section may be used with any of the methods of adjusting an image using message passing described herein, as will be demonstrated below. It is further shown that this convex belief propagation algorithm is equivalent to optimizing over both incoming and outgoing messages, i.e. block coordinate ascent is performed on a full star-like subgraph. Further, it will be shown that there are additional tuning weights in the algorithm which, depending on the node traversal schedule, have a substantial impact on the observed performance.

If one considers a particular node s and fixes all unknowns other than ρ_(s) and {λ_(α→s)}_(α∋s), the subproblem induced by (Eq. 3) reads as

$\begin{aligned} \max_{\rho_{s}, \{\lambda_{\alpha \to s}\}} \rho_{s} \quad \text{s.t.}\quad \rho_{s} &= \theta_{s}(x_{s}) + \sum_{\alpha \ni s} \lambda_{\alpha \to s}(x_{s}) \\ \lambda_{\alpha \to s}(x_{s}) &\leq \min_{x_{\alpha} \backslash x_{s}} \Big\{ \theta_{\alpha}(x_{\alpha}) - \sum_{t \in \alpha \backslash s} \lambda_{\alpha \to t}(x_{t}) \Big\} \end{aligned} \qquad (\text{Eq. 4})$

The right hand sides of the inequality constraints are defined as μ_(α→s)(x_(s)),

$\mu_{\alpha \to s}(x_{s}) \overset{\text{def}}{=} \min_{x_{\alpha} \backslash x_{s}} \Big\{ \theta_{\alpha}(x_{\alpha}) - \sum_{t \in \alpha \backslash s} \lambda_{\alpha \to t}(x_{t}) \Big\}. \qquad (\text{Eq. 5})$
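For a clique containing just two nodes s and t, the upper bound of (Eq. 5) reduces to a minimum over the labels of t. The sketch below assumes that pairwise case; the name `upper_bound_message`, the table layout `theta_pair[x_s][x_t]` and the numbers are illustrative assumptions.

```python
def upper_bound_message(theta_pair, lambda_to_t):
    """Compute mu(x_s) = min over x_t of theta(x_s, x_t) - lambda_to_t(x_t)
    for a clique made of the two nodes s and t.

    theta_pair[x_s][x_t] is the clique potential and lambda_to_t[x_t] is the
    current message from the clique to the other node t."""
    return [
        min(row[x_t] - lambda_to_t[x_t] for x_t in range(len(lambda_to_t)))
        for row in theta_pair
    ]


# Hypothetical two-label clique potential penalising disagreement.
theta_pair = [[0.0, 1.0], [1.0, 0.0]]

mu_zero_messages = upper_bound_message(theta_pair, [0.0, 0.0])
mu_after_update = upper_bound_message(theta_pair, [0.5, -0.25])
```

With zero messages the bound is simply the row-wise minimum of the clique potential. Once the other node claims part of the clique capacity through its message, the bound available to s shrinks or grows accordingly.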

Non-negative weights w_(α→s) are introduced such that Σ_(α∋s)w_(α→s)=1 (but otherwise chosen arbitrarily) and the ansatz λ_(α→s)(x_(s))=μ_(α→s)(x_(s))−w_(α→s)δ_(s)(x_(s)) is used for some δ_(s)(x_(s))≧0 to obtain the equivalent problem to (Eq. 4),

$\max_{\rho_{s}, \delta_{s} \geq 0} \rho_{s} \quad \text{s.t.}\quad \rho_{s} = \theta_{s}(x_{s}) + \sum_{\alpha \ni s} \mu_{\alpha \to s}(x_{s}) - \delta_{s}(x_{s}) \qquad (\text{Eq. 6})$

The choice of these weights and their impact on the convergence rate will be discussed later in this section. Since δ_(s)(x_(s))≧0, the largest allowed value for ρ_(s) is given by

$\rho_{s} \leftarrow \min_{x_{s}} \underbrace{\Big\{ \theta_{s}(x_{s}) + \sum_{\alpha \ni s} \mu_{\alpha \to s}(x_{s}) \Big\}}_{\overset{\text{def}}{=}\, \nu_{s}(x_{s})} \qquad (\text{Eq. 7})$

and δ_(s)(x_(s)) and λ_(α→s) are consequently given by

$\begin{aligned} \delta_{s}(x_{s}) &\leftarrow \theta_{s}(x_{s}) + \sum_{\alpha \ni s} \mu_{\alpha \to s}(x_{s}) - \rho_{s} = \nu_{s}(x_{s}) - \rho_{s} \\ \lambda_{\alpha \to s}(x_{s}) &\leftarrow \mu_{\alpha \to s}(x_{s}) - w_{\alpha \to s}\, \delta_{s}(x_{s}). \end{aligned}$

Via complementary slackness it is easy to see that if λ is dual optimal, then ν_(s)(x_(s))>ρ_(s) implies b_(s)(x_(s))=0 in the primal solution of E_(MAP). Algorithm 1 summarizes this convex BP method.

Algorithm 1 Node-based message passing

Require: Arbitrary feasible λ and ρ, weights w_(α→s)
while not converged do
  loop over s∈ν and assign for all α∋s
    Node update:
    $\mu_{\alpha \to s}(x_{s}) \leftarrow \min_{x_{\alpha} \backslash x_{s}} \Big\{ \theta_{\alpha}(x_{\alpha}) - \sum_{t \in \alpha \backslash s} \lambda_{\alpha \to t}(x_{t}) \Big\}$
    $\rho_{s} \leftarrow \min_{x_{s}} \Big\{ \theta_{s}(x_{s}) + \sum_{\alpha \ni s} \mu_{\alpha \to s}(x_{s}) \Big\}$
    $\delta_{s}(x_{s}) \leftarrow \theta_{s}(x_{s}) + \sum_{\alpha \ni s} \mu_{\alpha \to s}(x_{s}) - \rho_{s}$
    $\lambda_{\alpha \to s}(x_{s}) \leftarrow \mu_{\alpha \to s}(x_{s}) - w_{\alpha \to s}\, \delta_{s}(x_{s})$
  end loop
end while
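The node update of Algorithm 1 can be sketched in a few lines for the special case where every clique is an edge between two nodes. The sketch makes several assumptions that are not part of the algorithm as stated: uniform weights, zero initial messages (feasible when all potentials are non-negative), a fixed number of passes as the convergence criterion, and labels decoded as the minimiser of ν_(s). The names `node_update` and `message_passing` are hypothetical.

```python
def node_update(s, unary, edges, theta, lam, weights):
    """One node update at s. theta[e][x_u][x_v] follows the stored order e = (u, v)."""
    num_labels = len(unary[s])
    incident = [e for e in edges if s in e]
    mu = {}
    for e in incident:
        t = e[1] if e[0] == s else e[0]

        def cost(xs, xt, e=e):
            return theta[e][xs][xt] if e[0] == s else theta[e][xt][xs]

        # Upper bound on the message from clique e to s (Eq. 5).
        mu[e] = [
            min(cost(xs, xt) - lam[e, t][xt] for xt in range(len(unary[t])))
            for xs in range(num_labels)
        ]
    nu = [unary[s][xs] + sum(mu[e][xs] for e in incident) for xs in range(num_labels)]
    rho = min(nu)
    for e in incident:
        # delta_s(x_s) = nu_s(x_s) - rho_s is shared out according to the weights.
        lam[e, s] = [
            mu[e][xs] - weights[e, s] * (nu[xs] - rho) for xs in range(num_labels)
        ]
    return rho, nu


def message_passing(nodes, unary, edges, theta, passes=10):
    lam = {(e, s): [0.0] * len(unary[s]) for e in edges for s in e}
    degree = {s: sum(1 for e in edges if s in e) for s in nodes}
    weights = {(e, s): 1.0 / degree[s] for e in edges for s in e}  # uniform
    rho, nu = {}, {}
    for _ in range(passes):
        for s in nodes:  # sequential traversal
            rho[s], nu[s] = node_update(s, unary, edges, theta, lam, weights)
    labels = {s: min(range(len(nu[s])), key=nu[s].__getitem__) for s in nodes}
    return labels, sum(rho.values())


# Hypothetical two-pixel problem with two labels.
nodes = ["A", "B"]
unary = {"A": [0.0, 2.0], "B": [1.5, 0.0]}
edges = [("A", "B")]
theta = {("A", "B"): [[0.0, 1.0], [1.0, 0.0]]}

labels, dual_value = message_passing(nodes, unary, edges, theta)
```

On this two-node example the dual value reaches 1.0, which equals the cost of the best labelling (A takes label 0, B takes label 1), so the relaxation is tight here. On graphs with cycles the dual value is in general only a lower bound on the minimal cost.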

Before discussing the impact of the choice of weights w_(α→s), the following is stated:

Result 1. Algorithm 1 is equivalent to performing dual coordinate ascent with respect to both incoming messages λ_(α→s) and reverse messages λ_(α→t), t∈α\s. This means that for a fixed node s messages λ_(α→t) are updated for all cliques α containing s.

Proof of Result 1: Consider updating both λ_(α→s) and λ_(α→t) for a node s and its neighbouring nodes t (i.e. nodes sharing a clique with s). Thus, maximise

$\begin{aligned} \max_{\rho_{s}, \{\rho_{t}\}, \{\lambda_{\alpha \to t}\}_{\alpha \ni s}} \ & \rho_{s} + \sum_{t} \rho_{t} \\ \text{s.t.}\quad \rho_{s} &= \theta_{s}(x_{s}) + \sum_{\alpha \ni s} \lambda_{\alpha \to s}(x_{s}) \\ \rho_{t} &= \tilde{\theta}_{t}(x_{t}) + \lambda_{\alpha \to t}(x_{t}) \quad \forall t \neq s \\ \sum_{r \in \alpha} \lambda_{\alpha \to r}(x_{r}) &\leq \theta_{\alpha}(x_{\alpha}), \end{aligned}$

where θ̃_(t)(x_(t)) is the reparametrized unary potential, θ̃_(t)(x_(t)) ≝ θ_(t)(x_(t))+Σ_(β≠α)λ_(β→t)(x_(t)). Observe that one can replace each ρ_(t) by ρ_(t)−ε_(t) (for all t∈α\s), and substitute ρ_(s) with ρ_(s)+Σ_(t)ε_(t) (setting λ_(α→s)←λ_(α→s)+Σ_(t∈α\s)ε_(t) and λ_(α→t)←λ_(α→t)−ε_(t)) without changing the objective (or violating constraints). Thus, it is possible to assume that ρ_(t)=0 without loss of generality, and λ_(α→t)(x_(t))=−θ̃_(t)(x_(t)). Plugging this into the maximization problem yields

$\begin{aligned} \max_{\rho_{s}, \{\lambda_{\alpha \to s}\}} \rho_{s} \quad \text{s.t.}\quad \rho_{s} &= \theta_{s}(x_{s}) + \sum_{\alpha \ni s} \lambda_{\alpha \to s}(x_{s}) \\ \lambda_{\alpha \to s}(x_{s}) &\leq \theta_{\alpha}(x_{\alpha}) + \sum_{t \in \alpha \backslash s} \tilde{\theta}_{t}(x_{t}). \end{aligned}$

This variant of dual coordinate ascent is not stronger than optimizing solely over incoming messages λ_(α→s). As one has the freedom to arbitrarily assign values to all ρ_(t), the values ρ_(t) can be kept constant, i.e. one does not need to maximize with respect to ρ_(t) at all. In this case λ_(α→t) is fixed to its old value (which follows from ρ_(t)=θ̃_(t)(x_(t))+λ_(α→t)(x_(t))) and maximization is performed only with respect to ρ_(s) and λ_(α→s). Thus, optimizing with respect to both messages is equivalent to the original method in Algorithm 1.

The block coordinate method in Algorithm 1 optimizes over fewer unknowns than, for example, the tree block-coordinate one, but updates a larger set of unknowns than min-sum-diffusion or MPLP.

Stable points: Recall that a block-coordinate method applied on a (not strictly) convex (or concave) problem is only guaranteed to monotonically improve the objective but does not necessarily achieve an optimal solution as its fixed point. Convergence to a fixed point follows from monotonicity. Reasoning about fixed points is actually too restrictive, since the dual objective Σ_(s)ρ_(s) will usually remain constant long before λ reaches a fixed point. Hence we relax the fixed point condition and will introduce stable points shortly. Another slight complication arises from the fact that the assignment of contributions ρ_(s) to the dual objective value ν*=Σ_(s)ρ_(s) is not unique: one can arbitrarily shift quantities between nodes by adjusting the messages without changing the objective or active constraints (i.e. without changing the primal solution). Fortunately, this ambiguity is fixed by updating only ρ_(s) for a single node in each node update step in Algorithm 1 (i.e. the freedom to modify ρ_(t) at neighbouring nodes t is not used), and one can define stable points:

Definition 1. λ is called a stable point for E_(MAP)*(·|θ) if the following condition is met:

∀s: ∃x_(s): ∀α∋s: λ_(α→s)(x_(s))=μ_(α→s)(x_(s))  (Eq. 8)

(using the definition of μ_(α→s)(x_(s)) as in (Eq. 5)).

In other words, λ is stable if for all nodes there exists a state with all capacity constraints being active. In some sense stable points are fixed points for node updates:

Result 2. If λ is a stable point, running Algorithm 1 (with any traversal schedule for nodes) will not improve the dual objective E_(MAP)*(·|θ).

Before sketching the proof, the notion of active states is introduced:

Definition 2. Let potentials θ and messages λ be given. Using the notation as in Algorithm 1, a state x_(s) is called active if ν_(s)(x_(s))=min_(x′_(s))ν_(s)(x′_(s))=ρ_(s).

A state x_(s) being active at node s means that all capacity constraints are active for cliques α containing s, thus Σ_(t∈α)λ_(α→t)(x_(t))=θ_(α)(x_(α)). If w_(α→s)>0, i.e. the weights are chosen from the interior of the unit simplex, the converse is also true.

Result 2 can be seen as follows: a node update at s (Algorithm 1) can only improve the value of ρ_(s) if at least one capacity constraint for a clique α∋s becomes inactive for every previously active state. If this is not the case, then ρ_(s)=Σ_(α∋s)π_(α→s)(x_(s)) also remains constant, and it is easy to see that then λ_(α→s)(x_(s)) also remains constant for active states. Messages may change for inactive states.

Traversal schedule and choice of weights: The scheduling policy, i.e. the order in which the nodes s∈ν are traversed, and the exact choice of w_(α→s) are unspecified parameters of the algorithm. Intuitively, different choices for the node traversal schedule and weights may be beneficial for the speed of convergence, since relevant message information may be propagated faster depending on the schedule and employed weights. A non-uniform weighting assigning larger weights to forward (hyper-)edges means that messages incoming at successor nodes t will have larger upper bounds μ_(α→t), and the subproblem (Eq. 4) at node t is therefore less constrained. In FIG. 3 the evolution of dual energies with respect to the number of passes over the image for a dense stereo correspondence problem is illustrated. “Seq” refers to a schedule that alternates between sequential top-to-bottom, left-to-right traversal and its reverse, “par” is the schedule of (possibly simultaneously) updating every other pixel in the image, and “row” refers to a row-parallel schedule that simultaneously traverses every other row in the image and reverses the direction after every pass. We show results for uniform and non-uniform weight assignments. The combination “seq/non-uniform” has the fastest convergence speed if run on a sequential processor. Since the policy “row/non-uniform” is suitable for parallel implementation, we use this policy in our experiments.

The higher resolution stereo pair of the “Cones” dataset is used to generate the graph in FIG. 3. The unary potentials are NCC-induced costs,

$\frac{1}{\tau}\min\left\{ \tau,\frac{1 - \mathrm{ZNCC}}{2} \right\}$

with τ=0.5 and the NCC score computed on a 5×5 grayscale patch. ZNCC is the zero-mean NCC of 5×5 gray-scale image patches. The P₁-P₂ smoothness model is used. For the “weak regularization” setting P₁=¼, P₂=1 was chosen and FIG. 4A shows the converged result. To obtain strong regularization P₁=1, P₂=4 were chosen and FIG. 4(b) shows the converged result.

The meaning of the traversal schedules is as follows:

-   Seq: sequential scan from the top-left pixel to the bottom-right one (for odd passes) and its reverse (for even passes).
-   Row: sequential scan from left to right for odd rows first, then even ones. The direction is reversed to right-to-left in every other pass.
-   Par: update white pixels in a checkerboard pattern first, then black ones.
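The three schedules can be sketched as pixel orderings on a height×width grid. This is a minimal illustration, not the implementation used for the experiments: the function names are hypothetical, passes are assumed to be numbered from 1, the choice of which checkerboard parity is "white" is arbitrary, and the "row" and "par" orderings are serialised here although they are intended for parallel execution.

```python
def seq_order(height, width, pass_index):
    """Raster scan; reversed on even passes (pass_index starts at 1)."""
    order = [(r, c) for r in range(height) for c in range(width)]
    return order if pass_index % 2 == 1 else order[::-1]


def row_order(height, width, pass_index):
    """Odd rows first (1-based numbering), then even rows.

    The scan direction within each row flips on every other pass. On a
    4-connected grid no two rows of the same group are adjacent, so they
    could be processed in parallel; this sketch serialises them.
    """
    cols = range(width) if pass_index % 2 == 1 else range(width - 1, -1, -1)
    rows = list(range(0, height, 2)) + list(range(1, height, 2))
    return [(r, c) for r in rows for c in cols]


def par_order(height, width):
    """Checkerboard: "white" pixels first, then "black" ones."""
    pixels = [(r, c) for r in range(height) for c in range(width)]
    white = [p for p in pixels if (p[0] + p[1]) % 2 == 0]  # parity is an assumption
    black = [p for p in pixels if (p[0] + p[1]) % 2 == 1]
    return white + black
```

Each function returns every pixel exactly once per pass, so any of them can drive the outer loop over nodes.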

The weights are assigned according to the following:

-   Uniform weighting: all w_(α→s) are set to 1/deg(s).
-   Non-uniform weighting: w_(α→s) is set to ε=1/32 for backward edges in the traversal schedule. For the “seq” schedule, forward edges have weight w_(α→s)=½−ε, such that the total sum of weights is 1. In the “row” schedule, edges to pixels in the previous and next row have weight ¼, and therefore the forward edge has weight 1−2×¼−ε=½−ε.
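The arithmetic behind these weights can be checked with a short sketch. It assumes an interior pixel of a 4-connected grid (so deg(s)=4), and, for the “seq” schedule, two backward edges (left, up) and two forward edges (right, down). The function name and dictionary keys are hypothetical.

```python
from fractions import Fraction

EPS = Fraction(1, 32)  # backward-edge weight, as stated in the text


def edge_weights(schedule, uniform=False):
    """Weights of the four edges at an interior pixel of a 4-connected grid.

    'backward'/'forward' lie along the scan direction within a row;
    'prev_row'/'next_row' connect to the rows above and below.
    """
    if uniform:
        w = Fraction(1, 4)  # 1/deg(s) with deg(s) = 4
        return {"backward": w, "forward": w, "prev_row": w, "next_row": w}
    if schedule == "seq":
        # left and up were already visited (backward); right and down are forward
        half = Fraction(1, 2) - EPS
        return {"backward": EPS, "prev_row": EPS, "forward": half, "next_row": half}
    if schedule == "row":
        quarter = Fraction(1, 4)
        forward = 1 - 2 * quarter - EPS  # = 1/2 - eps
        return {"backward": EPS, "prev_row": quarter,
                "next_row": quarter, "forward": forward}
    raise ValueError(f"unknown schedule: {schedule}")
```

In every case the weights at a node sum to exactly 1, which is what places them on the unit simplex.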

Lazy Evaluation and Pessimistic Potentials

In this section it is assumed that the potentials are not given in advance for each state and need to be computed on demand. One strategy is to utilize a proxy for not-yet queried potentials, and to determine the states considered promising for subsequent queries while performing inference. Thus, reparametrized costs arising in the inference procedure can guide the exploration of true values for the respective potentials. It will be shown that using an upper bound for not queried potentials, i.e. “pessimistic” potentials, is highly beneficial in this context.

This section provides the necessary background and focuses on “lazy evaluation” of computationally costly unary potentials, but the inference algorithm keeps the full representation of messages. In the next section it is described how delayed evaluation of clique potentials yields compressed message representations, and how it enables more efficient inference.

Pessimistic potentials: Let θ be the true but only partially known potentials, and assume that pessimistic upper bound potentials {overscore (θ)}≧θ (point-wise) are available. By construction we have

$\begin{matrix}{\min\limits_{\lambda}E_{MAP}^{*}(\lambda \mid \theta) \leq \min\limits_{\lambda}E_{MAP}^{*}(\lambda \mid \bar{\theta}).} & {(\text{Eq. 9})}\end{matrix}$

Definition 3. For given potentials θ let ϑ be a reparametrization of θ such that ϑ_(s)(x_(s))=0 for all s and x_(s). Without loss of generality we set

$\begin{matrix}{\vartheta_{\alpha}(x_{\alpha}) \leftarrow \theta_{\alpha}(x_{\alpha}) + \sum\limits_{s \in \alpha}\frac{\theta_{s}(x_{s})}{\deg(s)}, \qquad \vartheta_{s}(x_{s}) \leftarrow 0.} & {(\text{Eq. 10})}\end{matrix}$
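For pairwise cliques, the reparametrization of (Eq. 10) can be sketched as follows. The data layout (dictionaries keyed by node and by edge) and the function name are assumptions made for illustration only.

```python
def reparametrize(unary, pairwise, edges):
    """Fold unary potentials into pairwise ones as in Eq. 10.

    unary:    dict node -> list of costs, indexed by state
    pairwise: dict (s, t) -> 2-D list, pairwise[(s, t)][x_s][x_t]
    edges:    list of (s, t) pairs; each node is assumed to have degree >= 1
    Returns a dict (s, t) -> 2-D list holding the reparametrized clique
    potential; the unary potentials are implicitly zero afterwards.
    """
    deg = {s: 0 for s in unary}
    for s, t in edges:
        deg[s] += 1
        deg[t] += 1
    out = {}
    for s, t in edges:
        out[(s, t)] = [
            [pairwise[(s, t)][xs][xt]
             + unary[s][xs] / deg[s]
             + unary[t][xt] / deg[t]
             for xt in range(len(unary[t]))]
            for xs in range(len(unary[s]))
        ]
    return out
```

Because each node's unary cost is split evenly over its deg(s) cliques, the total energy of any labelling is unchanged by the reparametrization.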

Instead of reasoning about stable points of E_(MAP)*(·|θ) it is beneficial to work with E_(MAP)*(·|ϑ), since modifying potentials θ will only affect the capacity constraints but not the balance conditions. For clarity, E_(MAP)*(·|ϑ) is here restated:

$\begin{matrix}{E_{MAP}^{*}(\lambda \mid \vartheta) = \sum\limits_{s}\rho_{s} \quad \text{s.t.} \quad \rho_{s} = \sum\limits_{\alpha \ni s}\lambda_{\alpha \to s}(x_{s}), \qquad \sum\limits_{s \in \alpha}\lambda_{\alpha \to s}(x_{s}) \leq \vartheta_{\alpha}(x_{\alpha}) \; \left( = \theta_{\alpha}(x_{\alpha}) + \sum\limits_{s \in \alpha}\frac{\theta_{s}(x_{s})}{\deg(s)} \right)} & {(\text{Eq. 11})}\end{matrix}$

This correspondence also carries over to attributed potentials such as {overscore (θ)} and {overscore (ϑ)}, etc. For these reparametrized potentials it is clear that if λ is feasible for E_(MAP)*(·|ϑ), then it is also feasible for E_(MAP)*(·|{overscore (ϑ)}). The interesting question is the following: when are stable points λ for E_(MAP)*(·|{overscore (ϑ)}) also stable points for E_(MAP)*(·|ϑ)?

Result 3. Let {overscore (λ)} be a stable point for E_(MAP)*(·|{overscore (ϑ)}). If {overscore (λ)} is feasible for E_(MAP)*(·|ϑ), then {overscore (λ)} is also a stable point for E_(MAP)*(·|ϑ).

Proof. Let {overscore (λ)} be a stable point for E_(MAP)*(·|{overscore (ϑ)}) and feasible with respect to E_(MAP)*(·|ϑ). Fix a node s. The essential quantities in Algorithm 1 applied on {overscore (λ)} are

$\begin{matrix}{\bar{\mu}_{\alpha \to s}(x_{s}) \overset{def}{=} \min\limits_{x_{\alpha} \backslash x_{s}}\left\{ \bar{\vartheta}_{\alpha}(x_{\alpha}) - \sum\limits_{t \in \alpha \backslash s}\bar{\lambda}_{\alpha \to t}(x_{t}) \right\}, \qquad \bar{\nu}_{s}(x_{s}) \overset{def}{=} \sum\limits_{\alpha \ni s}\bar{\mu}_{\alpha \to s}(x_{s}).} & {(\text{Eq. 12})}\end{matrix}$

Since {overscore (λ)} is a fixed point for {overscore (ϑ)} we have

$\begin{matrix}{\sum\limits_{\alpha \ni s}\bar{\lambda}_{\alpha \to s}(x_{s}) = \bar{\rho}_{s} = \min\limits_{x_{s}^{\prime}}\bar{\nu}_{s}(x_{s}^{\prime}).} & {(\text{Eq. 13})}\end{matrix}$

Updating the messages incoming at s now with respect to the potential ϑ requires computation of

$\begin{matrix}{\mu_{\alpha \to s}(x_{s}) \leftarrow \min\limits_{x_{\alpha} \backslash x_{s}}\left\{ \vartheta_{\alpha}(x_{\alpha}) - \sum\limits_{t \in \alpha \backslash s}\bar{\lambda}_{\alpha \to t}(x_{t}) \right\}, \qquad \nu_{s}(x_{s}) \leftarrow \sum\limits_{\alpha \ni s}\mu_{\alpha \to s}(x_{s})} & {(\text{Eq. 14})}\end{matrix}$

and ρ_(s)←min_(x_(s))ν_(s)(x_(s)). By construction we have μ_(α→s)(x_(s))≦{overscore (μ)}_(α→s)(x_(s)), ν_(s)(x_(s))≦{overscore (ν)}_(s)(x_(s)), and ρ_(s)≦{overscore (ρ)}_(s). But since {overscore (λ)} is feasible with respect to ϑ it is known that

$\begin{matrix}{\bar{\lambda}_{\alpha \to s}(x_{s}) \leq \vartheta_{\alpha}(x_{\alpha}) - \sum\limits_{t \in \alpha \backslash s}\bar{\lambda}_{\alpha \to t}(x_{t})} & {(\text{Eq. 15})}\end{matrix}$

i.e. {overscore (λ)}_(α→s)(x_(s))≦μ_(α→s)(x_(s)). This implies that

$\begin{matrix}{\bar{\rho}_{s} = \sum\limits_{\alpha \ni s}\bar{\lambda}_{\alpha \to s}(x_{s}) \leq \min\limits_{x_{s}^{\prime}}\left\{ \sum\limits_{\alpha \ni s}\mu_{\alpha \to s}(x_{s}^{\prime}) \right\} = \rho_{s}} & {(\text{Eq. 16})}\end{matrix}$

and therefore {overscore (ρ)}_(s)=ρ_(s). Thus, the objective does not improve by updating the messages incoming at s. Further, for active states x_(s) (w.r.t. {overscore (λ)}) one has {overscore (λ)}_(α→s)(x_(s))={overscore (μ)}_(α→s)(x_(s)) (since the capacity constraints are active for active states), and combining this with {overscore (λ)}_(α→s)(x_(s))≦μ_(α→s)(x_(s)) and μ_(α→s)(x_(s))≦{overscore (λ)}_(α→s)(x_(s)) one obtains

{overscore (λ)}_(α→s)(x_(s))=μ_(α→s)(x_(s))  (Eq. 17)

Hence, for active states δ_(s)(x_(s))=0 and the updated messages λ_(α→s)(x_(s)) for these states are given by

λ_(α→s)(x_(s))←μ_(α→s)(x_(s))={overscore (μ)}_(α→s)(x_(s))={overscore (λ)}_(α→s)(x_(s))  (Eq. 18)

and {overscore (λ)} is a stable point for E_(MAP)*(·|ϑ).

The relevance of this result in our setting is the following: given upper bounds on the true potentials, one can interleave sampling (or exploration) of their true value with MAP inference via successive node updates (or exploitation in a wider sense), and in the limit that combined exploration-exploitation strategy still leads to a stable point of the full inference problem. It also indicates, given current messages λ, which states are good candidates to query the respective true potential: they are the ones that are more likely to violate the corresponding capacity constraint.

Lazy evaluation of unary potentials: The typical setting is that the unary potentials are non-parametric and costly-to-evaluate data terms, and that clique potentials (usually pairwise ones) are parametric and inexpensive to compute. Hence for the purposes of the present embodiment, one is interested in an approach that leads to “guided” evaluation of unseen data terms, which are then used in subsequent node updates.

In the following we will assume that the unary potentials are bounded from above, e.g. ∀s, x_(s): θ_(s)(x_(s))≦1. The upper bound may be dependent on s, but for simplicity assume a constant upper bound. Partial knowledge of (unary) potentials combined with an upper bound on the unknown ones leads to a related MAP inference problem:

Definition 4. For each s∈ν let L(s) be the set of resident states for which the true unary potentials are known, and let {circumflex over (θ)} be constructed as follows:

$\begin{matrix}{\hat{\theta}_{s}(x_{s}) = \begin{cases}\theta_{s}(x_{s}) & \text{if } x_{s} \in L(s) \\ 1 & \text{otherwise,}\end{cases}} & {(\text{Eq. 19})}\end{matrix}$

and {circumflex over (θ)}_(α)(x_(α))=θ_(α)(x_(α)).
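As a minimal sketch of (Eq. 19), the pessimistic unary array for one node can be built so that the expensive true data term is evaluated only for resident states. The function name and the callable interface are hypothetical; the constant upper bound of 1 follows the assumption stated above.

```python
def pessimistic_unary(true_unary, resident, num_states, upper_bound=1.0):
    """Build the array of Eq. 19 for one node.

    true_unary: callable x -> cost, assumed expensive; it is called only
                for resident states.
    resident:   the set L(s) of states whose true cost is known.
    """
    return [true_unary(x) if x in resident else upper_bound
            for x in range(num_states)]
```

Non-resident states never trigger a call to `true_unary`, which is where the runtime savings of lazy evaluation come from.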

Note that for {circumflex over (θ)} there are corresponding reparametrized potentials {circumflex over (ϑ)} with vanishing unary potentials (recall Def. 3). With these definitions it is possible to present a meta-algorithm for MAP inference with lazy evaluation of data terms in Algorithm 2:

    Require: Number of passes T, initial cardinality |L(s)|
    Require: Weights w_(α→s)
     1: Initialize L(s) for all s (e.g. randomly)
     2: [framed instruction, limited-memory variant only]
     3: Compute initial {circumflex over (θ)}, initialize all messages λ with 0
     4: for k = 1 . . . T do
     5:   loop over s ∈ V
     6:     [framed instruction, limited-memory variant only]
     7:     [framed instruction, limited-memory variant only]
     8:     Find potentially violating state x_(s)⁺
     9:     L(s) ← L(s) ∪ {x_(s)⁺}, add θ_(s)(x_(s)⁺) to {circumflex over (θ)}
    10:     Apply node update w.r.t. {circumflex over (ϑ)}
    11:   end loop
    12: end for
    13: Optional: run Alg. 1 to reach a stable point λ

Note: The framed instructions only apply to the limited-memory variant discussed in the section entitled “Limited-memory PM-CBP”.

The same remarks on the node traversal schedule and choice of weights w_(α→s) as for Algorithm 1 apply. In the limit T→∞ every state is explored in the algorithm, and a stable point for the full MAP inference problem is obtained. The algorithm may also stop earlier if no violating state is found at any of the nodes. The most relevant application will be when T is a constant value to meet, for example, a runtime budget. This leads to the main open design choice in the algorithm: how to find a state x_(s) such that instantiating θ_(s)(x_(s)) leads to the largest subsequent reduction in the objective. In order to describe the principle, we assume for now that all states are considered at pixel s. In practice neighbouring pixels are utilized to generate a small set of candidate states, which will be described in the section entitled “Limited-memory PM-CBP” below. The selected possible pixel label x_(s)⁺ for which the unary potential is to be calculated may therefore be selected from a candidate set of possible pixel labels, wherein the candidate set of possible pixel labels is based on an estimated set of possible pixel labels of an adjacent node for which the unary potential is known. In this example, the state x_(s)⁺ to evaluate the unary potential for is determined as the state with the smallest value of ν_(s)(x_(s)), using an estimated cost {overscore (θ)}_(s)(x_(s)) in the place of the unknown θ_(s)(x_(s)), i.e. x_(s)⁺ is set to

$\begin{matrix}{\arg\min\limits_{x_{s} \notin L(s)}\sum\limits_{\alpha \ni s}\min\limits_{x_{\alpha} \backslash x_{s}}\left\{ \bar{\vartheta}_{\alpha}(x_{\alpha}) - \sum\limits_{t \in \alpha \backslash s}\lambda_{\alpha \to t}(x_{t}) \right\},} & {(\text{Eq. 20})}\end{matrix}$

where {overscore (ϑ)} is obtained from {tilde over (θ)} by substituting {tilde over (θ)}_(s)(x_(s)) with {overscore (θ)}_(s)(x_(s)). Note that {overscore (θ)}_(s)(x_(s)) should be the true data term θ_(s)(x_(s)) or a lower bound thereof for Result 3 to hold. In practice it is possible to use the smallest unary potential from the neighbours as the estimate, i.e. {overscore (θ)}_(s)(x_(s))←min_(t∈N(s)){tilde over (θ)}_(t)(x_(t)), which appears to work well. Overall, determining x_(s)⁺ according to (Eq. 20) essentially amounts to performing one node update step of Algorithm 1 and has the same runtime complexity.
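For pairwise cliques, the selection rule of (Eq. 20) can be sketched as below. This is an illustration under stated assumptions rather than the patented implementation: each clique is an edge {s, t}, `pair_cost[t]` is a hypothetical callable returning the reparametrized cost with the estimated unary cost already folded in, and `incoming[t]` holds the current messages λ_(α→t)(x_(t)).

```python
def select_new_state(num_states, resident, pair_cost, incoming):
    """Return the non-resident state minimising the Eq. 20 objective.

    num_states: number of states at node s
    resident:   set L(s) of already resident states at s
    pair_cost:  dict t -> callable (x_s, x_t) -> reparametrized clique cost
    incoming:   dict t -> list of messages lambda_{alpha->t}(x_t)
    Returns None when every state is already resident.
    """
    best_state, best_value = None, float("inf")
    for xs in range(num_states):
        if xs in resident:
            continue
        value = sum(
            min(pair_cost[t](xs, xt) - incoming[t][xt]
                for xt in range(len(incoming[t])))
            for t in incoming
        )
        if value < best_value:
            best_state, best_value = xs, value
    return best_state
```

The double loop has the same structure as a node update, which matches the remark that the selection has the runtime complexity of one update step of Algorithm 1.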

While above it was shown that the schedule and the weights influence the convergence speed for standard convex BP, PM-CBP benefits as well, especially if the algorithm is run for a few passes. FIG. 5 shows that the “row+non-uniform” setting achieves a lower primal energy much quicker than schedules using a uniform weighting.

FIG. 6 is a flowchart showing a computer implemented method for adjusting an image using message passing. The method may be performed by, for example, the architecture described above in relation to FIG. 1. At step 610 an input image is received. The input image comprises a plurality of pixels, each pixel having an input value and each input value corresponding to one of a number of possible pixel labels or states.

At step 612, each pixel of the input image is associated with a node of a graph and one or more cliques of nodes. Each possible pixel label is associated with a unary potential and, for each permutation of possible pixel labels of the pixels associated with the nodes of each clique, a clique potential is associated.

At step 614, for each clique of the graph, at least one message between each pair of nodes of that clique is initialised. For example, as with Algorithm 2 described above, all messages λ are initialised to 0.

At step 616, a node s is selected according to a node traversal scheduling policy. For example, the node traversal scheduling policy may indicate that nodes of the graph are to be traversed sequentially in the same order each time.

At step 618, for that node a respective set of possible pixel labels for which the unary potential is known is determined. In other words, for that node the set of resident states L(s) is determined. An array {tilde over (θ)}_(s)(x_(s)) may then be defined accordingly (see (Eq. 19)).

At step 620 a state x_(s)⁺ is selected, the state x_(s)⁺ not belonging to the respective set of possible pixel labels for which the unary potential is known. As described above, the state x_(s)⁺ may be selected based on a determination of the state with the smallest value of ν_(s)(x_(s)) by using an estimated cost {tilde over (θ)}_(s)(x_(s)).

At step 622, the unary potential of the selected possible pixel label is calculated.

At step 624, the respective set L(s) of possible pixel labels for which the unary potential is known is updated to include the selected state x_(s)⁺. The array {circumflex over (θ)}_(s)(x_(s)) may be updated accordingly.

At step 626, for each clique to which node s belongs, the clique potential associated with that clique is adjusted or reparameterised based on the updated respective set of possible pixel labels L(s). For example, the clique potentials are adjusted as in (Eq. 10), which also shows the unary potentials being adjusted to zero.

At step 628, a node update is performed according to Algorithm 1, using the adjusted clique potentials. This has the effect of, for each clique to which the node s belongs, adjusting, based on the adjusted clique potential associated with that clique, at least one of the messages between node s and the other nodes of that clique.

At step 630, a determination is made as to whether or not a convergence criterion has been met. If the convergence criterion has not been met, the method proceeds to step 632, in which a different node is selected according to the node traversal scheduling policy. The method then returns to step 618. The convergence criterion may be, for example, that a solution to the dual problem of (Eq. 11) has been found. The convergence criterion may be, for example, that the method has traversed over all nodes a predetermined number of times.

If, at step 630, a determination is made that the convergence criterion has been met, then the method proceeds to step 634 and an adjusted image is produced having pixel labels determined from the adjusted messages. If, for example, the convergence criterion is that a solution to the dual problem of (Eq. 11) has been found, then the adjusted image may be produced by translating the solution to a solution of a corresponding primal problem and assigning the labels to the pixels accordingly.

Limited-Memory PM-CBP

The main benefit of the basic PM-CBP algorithm is only apparent when early stopping is applied: in this case only a subset of unary potentials is evaluated, and the final convex BP iterations converge to the solution of a proxy MAP instance with partially pessimistic potentials. Runtime savings come from not evaluating all data terms (which can lead to substantial savings). Interestingly, one can go far beyond that to derive a limited memory version of PM-CBP, which maintains a constant number of states and respective messages. This is achieved using the following two observations: (i) by using proxy upper bounds for clique potentials, it is possible to represent all messages for non-resident states at a node by a single value, and (ii) at least one resident state can be made non-resident without decreasing the objective. It is therefore sufficient to maintain only three resident states per node. The method is outlined in Algorithm 2 (now including the framed instructions), and is explained in more detail in the following.

Group state “*”. The basic concept is to extend the use of upper bounds for unseen unary potentials (i.e. for non-resident states) to clique potentials. If any element of a clique state x_(α) is not resident (i.e. not in L(s)), the assigned potential {tilde over (θ)}_(α)(x_(α)) is an upper bound of the true value. In the experiments described herein truncated pairwise potentials are used, hence an upper bound is easily available. Adding a state x_(s) to the resident set does not only update {tilde over (θ)}_(s)(x_(s))←θ_(s)(x_(s)) (the true value), but it may also lead to the substitution {tilde over (θ)}_(α)(x_(α))←θ_(α)(x_(α)), if x_(α) now consists of only resident states. Since every unary and clique potential involving a non-resident state is constant for all states x_(s)∉L(s), it also implies that the messages λ_(α→s)(x_(s)) after a node update will attain the same value for all x_(s)∉L(s). Hence, the set of messages {λ_(α→s)(x_(s))}_(x_(s)∉L(s)) can be represented by a single value, which is denoted by λ_(α→s)(*). The group state “*” simultaneously represents all non-resident states. Thus, introducing a single message for all non-resident states does not affect the validity of the basic PM-CBP algorithm. The resident sets L(s) will grow by one element in each pass. In order to have a fixed sized resident set, the “least active” state x_(s)⁻ is discarded, i.e. the state x_(s)∈L(s) with the largest value of ν_(s)(x_(s)). That is, for each clique to which the node s belongs, there is an associated parameter μ_(α→s)(x_(s)), the associated parameter dependent on the possible pixel labels for node s. The one of the possible pixel labels x_(s)⁻ which is removed from the resident set L(s) is the possible pixel label for which the sum ν_(s)(x_(s)) of the parameters μ_(α→s)(x_(s)) is greatest. Removal of x_(s)⁻ from L(s) will therefore not reduce the objective.
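The eviction rule for the “least active” state can be sketched in a few lines. The function name and the use of a plain dictionary for ν_(s) are assumptions made for illustration; ties are broken by the smallest state index, which the text does not specify.

```python
def evict_least_active(resident, nu):
    """Remove and return the resident state with the largest nu_s value.

    resident: set of state indices L(s); modified in place
    nu:       mapping state -> nu_s(x_s), the sum of mu_{alpha->s}(x_s)
              over all cliques containing s
    """
    # sorted() makes the tie-breaking deterministic (an arbitrary choice)
    x_minus = max(sorted(resident), key=lambda x: nu[x])
    resident.discard(x_minus)
    return x_minus
```

After the call the resident set has shrunk by one, so adding the newly selected state restores its original size.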

Choice of x_(s)⁺: The state x_(s)⁺ to add to L(s) was determined by scanning over all non-resident states in the previous section (recall (Eq. 20)). If we maintain a small set of resident states L(s), this exhaustive scan may dominate the overall runtime complexity. Hence, it is sensible to generate potential candidates at node s based on resident states at neighbouring nodes t. Since (parametric) clique potentials usually encode smoothness assumptions, it is reasonable to randomly sample states using a density p(x_(s)|x_(α\s))∝exp(−θ_(α)(x_(α))). Since we do not assume the potentials to be calibrated, there is a global scale ambiguity in addition to local bias ambiguities (one degree of freedom per clique), which do not affect the overall MAP solution. Therefore, in practice it is possible to estimate distribution parameters from training data. Training data can be ground truth labelings or MAP solutions generated by full-scale inference. The set of resident clique states not containing s is

L(α\s)≝{x_(α\s): x_(t)∈L(t)}  (Eq. 21)

which allows the definition of the random proposal sets,

$\begin{matrix}{C_{s}(\alpha) = \bigcup\limits_{x_{\alpha \backslash s} \in L(\alpha \backslash s)}\left\{ x_{s}^{\prime} \sim p(x_{s}^{\prime} \mid x_{\alpha \backslash s}) \right\}} & {(\text{Eq. 22})}\end{matrix}$

and C(s)=∪_(α∋s)C_(s)(α)\L(s). C(s) is the set of random proposals generated from neighbouring resident states (minus the already resident states at s). Finally, the new resident state x_(s)⁺ is determined in analogy to (Eq. 20),

$x_{s}^{+} \leftarrow \arg\min\limits_{x_{s} \in C(s)}\sum\limits_{\alpha \ni s}\min\limits_{x_{\alpha} \backslash x_{s}}\left\{ \tilde{\vartheta}_{\alpha}(x_{\alpha}) - \sum\limits_{t \in \alpha \backslash s}\lambda_{\alpha \to t}(x_{t}) \right\}.$
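The proposal sets of (Eq. 22) can be sketched for pairwise cliques with integer labels. Everything specific here is an assumption: the three-way mixture stands in for the density p, its default probabilities are the relative frequencies reported for stereo later in this description, the third branch draws uniformly from all states (a simplification that also guarantees every state a positive probability), and out-of-range draws are clamped.

```python
import random


def propose(rng, x_t, num_states, p_same=0.94, p_near=0.058):
    """Draw one candidate state for node s given a neighbour state x_t."""
    u = rng.random()
    if u < p_same:
        x = x_t                          # same label as the neighbour
    elif u < p_same + p_near:
        x = x_t + rng.choice((-1, 1))    # label differs by one
    else:
        x = rng.randrange(num_states)    # rare jump anywhere in the state space
    return min(max(x, 0), num_states - 1)


def candidate_set(rng, resident_s, neighbour_residents, num_states):
    """C(s): one proposal per resident neighbour state, minus L(s)."""
    candidates = set()
    for resident_t in neighbour_residents:
        for x_t in sorted(resident_t):
            candidates.add(propose(rng, x_t, num_states))
    return candidates - set(resident_s)
```

A call such as `candidate_set(random.Random(0), {3}, [{3, 4}, {10}], 16)` returns a small set of labels near the neighbours' resident states, from which x_(s)⁺ is then chosen by the arg min above.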

If the estimated costs {overscore (θ)}_(α) are the true potential values or respective lower bounds and p(x′_(s)|x_(α\s)) assigns a positive probability to every state, such that every state x_(s) is in C(s) infinitely often when T→∞, then limited memory PM-CBP will reach a stable point of the full inference problem. This follows from Result 3 and the fact that feasibility of the current messages is tested infinitely often. It also implies that maintaining messages for three states is sufficient: one for the group state *, one for an active state, and one slot for x_(s)⁻/x_(s)⁺.

The data structure for messages in limited memory PM-CBP can be just a fixed size array. The slots for λ_(α→s)(x_(s)⁻) can be reused for λ_(α→s)(x_(s)⁺). In general, the algorithm requires only fixed size, pre-allocated data structures, which is a large benefit for GPU or embedded implementations.
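A minimal sketch of such a fixed-size message store for one (clique, node) pair is given below. The class and method names are hypothetical, and initialising the reused slot with the group value λ_(α→s)(*) is an assumption, motivated by the fact that the incoming state was represented by the group state until that point.

```python
class NodeMessages:
    """Fixed-size message storage for one (clique, node) pair."""

    def __init__(self, resident_states):
        self.states = list(resident_states)      # label stored in each slot
        self.values = [0.0] * len(self.states)   # lambda(x) per resident slot
        self.group = 0.0                         # lambda(*) for all other states

    def get(self, x):
        """Message for state x; non-resident states share the group value."""
        if x in self.states:
            return self.values[self.states.index(x)]
        return self.group

    def swap(self, x_minus, x_plus):
        """Reuse the slot of the evicted state x_minus for x_plus."""
        i = self.states.index(x_minus)
        self.states[i] = x_plus
        self.values[i] = self.group  # assumption: start from the group value
```

No allocation happens after construction, which is the property that makes this layout attractive for GPU or embedded targets.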

An optional subprocess is now described in relation to FIG. 7, which may take place within the process described above in relation to FIG. 6. FIG. 7 therefore shows a number of steps already described above in relation to FIG. 6 and which are accordingly allocated the same reference numerals.

After step 618, in which a respective set L(s) of possible pixel labels is determined for node s, the method proceeds to step 702 in which a state x_(s)⁻ of L(s) is selected. As described above, the state x_(s)⁻ may be selected based on a determination of the state with the greatest value of ν_(s)(x_(s)).

At step 704 the respective set of possible pixel labels L(s) is updated by removing one of the possible pixel labels therefrom. Accordingly, and with reference to (Eq. 19) above, in the array {tilde over (θ)}_(s)(x_(s)), the known unary potential θ_(s)(x_(s)⁻) is replaced with 1. As described above, this allows for the messages λ_(α→s)(x_(s)⁻) to be represented by the group state λ_(α→s)(*).

After step 704, the method proceeds to step 620 in which a state for which the unary potential is unknown is selected, and the method proceeds as in FIG. 6. In this way, the resident set L(s) stays a constant size for each node update (step 628) of node s.

Applications

The methods described herein may be used to perform a number of image processing tasks, as described below. In particular, by performing the methods described herein a solution, or approximate solution, to a primal problem such as that of (Eq. 2) may be found in a time-efficient and memory-efficient manner. The image processing task performed on the image depends on the parameters used to define the problem.

The performance of the methods described herein on dense correspondence problems is demonstrated. The general parameters are as follows: PM-CBP is performed with 5 resident states (plus one to represent all non-resident states) for the indicated number of passes T, followed by a fixed number of 32 convex BP iterations to refine the messages. Since a 4-neighborhood is used for the pairwise terms, the memory consumption is 6×4=24 floating point values per pixel. Primal solutions are extracted simply by reporting the state for each pixel with the smallest min-marginal ν_(s)(x_(s)). The algorithm is implemented in straightforward C++ with OpenMP enabled, and the runtimes are reported for a dual Xeon E5-2690 system with 64 GB of main memory. GPU acceleration is not employed.
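The memory figures can be reproduced with a small calculation. The function names are hypothetical, and the comparison assumes that full inference stores one message per state and neighbour.

```python
def floats_per_pixel(resident_states, neighbours):
    """Message values stored per pixel: resident slots plus the group state."""
    return (resident_states + 1) * neighbours


def memory_fraction(resident_states, total_states):
    """Memory relative to keeping one message per state (full inference)."""
    return (resident_states + 1) / total_states
```

With 5 resident states and a 4-neighbourhood, `floats_per_pixel(5, 4)` gives 24. For an illustrative state space of 120 labels, `memory_fraction(5, 120)` gives 0.05, i.e. 5% of the memory of full inference.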

Dense disparity estimation: Results are demonstrated on dense stereo instances from the Middlebury benchmark datasets. The state space contains integral disparity values and has between 60 and 240 elements (depending on the image pair). The data term (unary potential) attains values in [0, 1] and is given by

$\frac{1}{\tau}\min\left\{ \tau,\frac{1 - \mathrm{ZNCC}}{2} \right\},$

where ZNCC is the zero-mean NCC of 5×5 gray-scale image patches. τ is fixed to ½. Results are shown for two related pairwise potentials. The first one is a Potts smoothness model, and the second one is the 3-way pairwise potential

$\begin{matrix}{\theta_{st}(x_{s},x_{t}) = \begin{cases}0 & \text{if } x_{s} = x_{t} \\ P_{1} & \text{if } |x_{s} - x_{t}| = 1 \\ P_{2} & \text{otherwise,}\end{cases}} & {(\text{Eq. 23})}\end{matrix}$

which is also known as the P₁-P₂ smoothness. From ground truth disparity maps the relative frequencies of the events x_(s)=x_(t) (≈94%), x_(s)=x_(t)±1 (≈5.8%), and |x_(s)−x_(t)|≧2 (≈0.2%) are estimated for neighbouring pixels s, t. This defines how candidate states are sampled in Algorithm 2. FIG. 8 shows the evolution of the attained primal objective with respect to wall clock time for full scale convex BP (Algorithm 1) and limited memory PM-CBP. Clearly, PM-CBP achieves a lower energy much faster than convex BP with much lower memory requirements (5% for “Teddy” and “Cones”, and 2.5% for “Aloe”). The corresponding labelling results returned by PM-CBP after T=4, 8, 16, 32, 64 and 128 passes are illustrated in FIG. 9. FIG. 9 is a visual illustration of converging stereo results for the (starting at the top) “Cones”, “Teddy”, “Cones hires”, “Teddy hires”, “Aloe”, and “Baby3” datasets after the respective number T of passes.
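The P₁-P₂ potential of (Eq. 23) translates directly into a small function for integer labels. The function name is hypothetical.

```python
def p1p2(x_s, x_t, p1, p2):
    """P1-P2 smoothness cost between two integer labels (Eq. 23)."""
    diff = abs(x_s - x_t)
    if diff == 0:
        return 0.0
    return p1 if diff == 1 else p2
```

With the “weak regularization” setting, `p1p2(3, 4, 0.25, 1.0)` returns 0.25 and `p1p2(3, 7, 0.25, 1.0)` returns 1.0. Setting p1 equal to p2 yields the Potts model as a special case.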

As the objective optimized in the above discussion is the dual program to the original linear (i.e. primal) program in (Eq. 2), a primal solution can be extracted by complementary slackness. In practice this means that, given a solution of (Eq. 3), an approximate solution of (Eq. 1) is obtained by setting x_(s)*=arg min_(x_(s))ν_(s)(x_(s)) (recall (Eq. 7) for the definition of ν_(s)(x_(s))).
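The label extraction step amounts to a per-pixel arg min over the min-marginals. A minimal sketch, assuming the values ν_(s)(x_(s)) are available as one list per pixel (a hypothetical layout):

```python
def extract_labels(min_marginals):
    """Pick, for every pixel, the state with the smallest min-marginal.

    min_marginals: list (one entry per pixel) of lists nu_s(x_s)
    Ties are resolved in favour of the smallest state index.
    """
    return [min(range(len(nu)), key=nu.__getitem__) for nu in min_marginals]
```

For example, `extract_labels([[0.4, 0.1, 0.9], [0.2, 0.3, 0.25]])` returns `[1, 0]`.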

For dense disparity estimation the unknown label values are the disparities, and the unary potentials are computed by comparing the image patch centered at the current pixel in the reference image with the image patch in the “moving” image centered at the current pixel shifted by the disparity value under consideration. The comparison of image patches is based on a truncated zero-mean normalized cross correlation value.
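The truncated ZNCC cost can be sketched for two patches given as flat lists of intensities. The function names are hypothetical, and returning a correlation of 0 for a patch with zero variance is a convention chosen here, not something the description specifies.

```python
import math


def zncc(a, b):
    """Zero-mean normalized cross correlation of two equally sized patches."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0.0:
        return 0.0  # flat patch: correlation undefined, treated as 0 (assumption)
    return sum(x * y for x, y in zip(da, db)) / denom


def unary_cost(patch_ref, patch_mov, tau=0.5):
    """Truncated, rescaled ZNCC cost with values in [0, 1]."""
    return min(tau, (1.0 - zncc(patch_ref, patch_mov)) / 2.0) / tau
```

Identical patches give a cost of (approximately) 0, while anti-correlated or unrelated patches are truncated to the maximum cost of 1.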

Optical flow estimation: Similar numerical experiments were run for optical flow instances. The state space contains 129² flow vectors corresponding to a 16 pixel search range at quarter-pixel accuracy. The original grayscale images were upscaled to 400% and the same ZNCC based score as for dense stereo was used (but computed on 11×11 patches from the upscaled images). The pairwise smoothness term is the P₁-P₂ model applied separately in the horizontal and vertical component of the motion vector. The decrease in primal energy for the solution returned after the respective number of passes with respect to wall clock time is shown in FIG. 10. In this case the memory consumption is 6/129² or less than 0.04% of running full inference, and usable motion fields are obtained after a few seconds of CPU time. Visualizations of the corresponding flow fields are depicted in FIGS. 11 and 12. FIG. 11 is a visual illustration of the convergence of the optical flow field for the “Schefflera” dataset after a) 4, b) 8, c) 16, d) 32, e) 64 and f) 128 passes. FIG. 12 is a visual illustration of converging optical flow fields for various datasets after a) 4, b) 8, c) 16, d) 32, e) 64 and f) 128 passes. The color coding is similar to the Middlebury one, but uses higher saturated colors for better visibility.

Optical flow can be addressed in a very similar way to dense disparity estimation. In order to allow subpixel motion vectors, one can upscale the original pair of input images to 400% of the original size, and estimate integral motion vectors at this resolution. This yields quarter pixel motion vectors for the original image resolution.
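The size of the flow state space follows from the search range and the subpixel factor. The sketch below assumes that the 16 pixel search range is symmetric (±16 pixels per axis) and that labels are enumerated row by row; both the function names and this enumeration are hypothetical.

```python
def flow_labels_per_axis(search_range=16, subpixel=4):
    """Number of integral displacements per axis at the upscaled resolution."""
    return 2 * search_range * subpixel + 1


def label_to_flow(label, search_range=16, subpixel=4):
    """Map a label index to a (dx, dy) flow vector in original-image pixels."""
    n = flow_labels_per_axis(search_range, subpixel)
    row, col = divmod(label, n)
    offset = search_range * subpixel
    return ((col - offset) / subpixel, (row - offset) / subpixel)
```

With the defaults there are 129 displacements per axis, hence 129²=16641 labels, and each label corresponds to a flow vector on a quarter-pixel lattice.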

Image segmentation: In this setting the pixel label values to infer are object categories such as “sky”, “vegetation”, “road”, “building”, “pedestrian”, “car” etc. (these categories are useful if outdoor images are segmented; for medical images, for example, the categories might be different types of tissues and organs). The unary potentials can typically be trained from ground-truth data using a machine learning method, and the pairwise clique potential will usually be set to the Potts smoothness model, θ_(st)(x_(s), x_(t))=τ if x_(s)≠x_(t) (with τ greater than 0), and 0 otherwise.

Image denoising: For image denoising the pixel label values are unknown pixel values. The unary potential is derived from the (known or assumed) noise model of the imaging sensor, and the pairwise (or higher-order) clique potentials encode desired image statistics (e.g. how correlated two neighbouring pixel values are in natural images).
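By way of illustration only, the denoising potentials might take the following form. The description does not fix a noise model or a smoothness prior, so both choices here are assumptions: additive Gaussian sensor noise, whose negative log-likelihood gives a quadratic unary potential, and a truncated linear pairwise potential as a simple stand-in for natural image statistics.

```python
def gaussian_unary(label, observed, sigma):
    """Negative log-likelihood (up to a constant) of `observed` given true value `label`,
    assuming additive Gaussian noise with standard deviation `sigma`."""
    return (label - observed) ** 2 / (2.0 * sigma ** 2)


def truncated_linear(x_s, x_t, weight=1.0, cap=20.0):
    """Pairwise potential that penalises differences between neighbouring values,
    saturating at `cap` so that genuine edges are not over-penalised."""
    return weight * min(abs(x_s - x_t), cap)
```

With σ = 2, a candidate value two grey levels away from the observation costs 0.5, while any pair of neighbouring values more than 20 levels apart incurs the same capped penalty.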

Image deblurring and zooming: Image deblurring and zooming may be carried out using a similar method to image denoising. However, a blur kernel is added to the unary potential for each pixel.

Variations of the described arrangements are envisaged. For example, receiving an image may comprise receiving data from an external data source and processing the received data to produce an image. Receiving an image may comprise generating an image.

In the above discussion, a received image comprises a plurality of pixels, each pixel having an input value and each input value corresponding to one of a number of possible pixel labels. The number of pixel labels may vary from node to node or may be constant. The number of possible pixel labels may depend on the application for which the methods described herein are used. For example, if a mask is to be generated, then the number of possible pixel labels may be 2, with a first pixel label being assigned to the pixel if a determination is made that the pixel is related to an object, and a second pixel label being assigned to the pixel if a determination is made that the pixel is not related to the object.

Cliques of the graph may be formed of any number of nodes. Any two cliques may contain the same or a different number of nodes. The assignment of a node to a particular clique may be performed based on the problem to be solved.

In the discussion above, in the course of Algorithm 2 the messages were initialised at 0. Messages may be initialised at any suitable value.

Determining, for a node, a respective set of possible pixel labels for which the unary potential is known may comprise retrieving information concerning the respective set from memory.

Computing a unary potential of a selected possible pixel label for which the unary potential is unknown may comprise any suitable method for computing a unary potential. For example, one or more known unary potentials of possible pixel labels associated with adjacent nodes may be considered, and a suitable one of these unary potentials may be associated with the selected possible pixel label. For example, labels that are promising may be propagated to neighbouring pixels in order to rank candidate states for which to query the true data term.
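The propagation of promising labels from neighbouring pixels may be sketched as follows. This is one possible reading only: it assumes each node stores its known unary potentials in a dictionary, ranks the labels held by adjacent nodes by their known cost there, and then queries the true data term for the best-ranked label not yet held. The names resident, neighbours and data_term are hypothetical.

```python
def propose_candidate(node, resident, neighbours):
    """Pick the label with the lowest known cost at any adjacent node
    that is not yet in this node's own set of known labels."""
    best_label, best_cost = None, float("inf")
    for nb in neighbours[node]:
        for label, cost in resident[nb].items():
            if label in resident[node]:
                continue  # unary potential already known at this node
            if cost < best_cost:
                best_label, best_cost = label, cost
    return best_label


def expand_resident_set(node, resident, neighbours, data_term):
    """Query the true data term for the proposed label and store the result."""
    label = propose_candidate(node, resident, neighbours)
    if label is not None:
        resident[node][label] = data_term(node, label)
    return label
```

The neighbour's cost is used only to rank candidates; the value stored for the node is always the data term evaluated at that node.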

The selected possible pixel label may be selected according to any criterion. In the discussion above, the pixel label x_(s)⁺ was chosen based on a determination of the state with the smallest value of ν_(s)(x_(s)). However, the selected possible pixel label may be selected, for example, randomly.
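The two selection criteria mentioned here, namely the smallest value of ν_(s) or a random choice, may be sketched as follows. The function select_label and its arguments are hypothetical; nu is assumed to be any callable returning the value of ν_(s) for a candidate label.

```python
import random


def select_label(candidates, nu=None, rng=None):
    """Return the candidate minimising nu, or a uniformly random candidate if nu is None."""
    candidates = list(candidates)
    if not candidates:
        raise ValueError("no candidate labels to select from")
    if nu is not None:
        return min(candidates, key=nu)
    rng = rng or random.Random()
    return rng.choice(candidates)
```

For example, with candidate labels 4, 7 and 9 and ν_(s) values of 1.5, 0.2 and 0.8 respectively, the first criterion selects label 7.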

Adjusting the clique potential for each clique to which a node belongs based on the unary potentials associated with the updated respective set of possible pixel labels may comprise, for example, weighting the unary potentials. For example, the unary potentials may be weighted evenly according to the number of nodes adjacent to the node under observation, or may be weighted unevenly.
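Even weighting may be illustrated for a pairwise clique as follows. This is a sketch under stated assumptions rather than the formula of the described algorithm: each node's unary potential is divided by its number of adjacent nodes and added to the pairwise term, and a fixed constant (the hypothetical parameter default) stands in for any label whose unary potential is not yet known.

```python
def adjusted_clique_potential(x_s, x_t, pairwise, known_s, known_t,
                              deg_s, deg_t, default=10.0):
    """Pairwise potential plus an even share of each endpoint's unary potential.

    known_s / known_t map labels to known unary potentials; labels absent
    from them receive the constant `default` (an illustrative choice).
    """
    u_s = known_s.get(x_s, default)
    u_t = known_t.get(x_t, default)
    return pairwise(x_s, x_t) + u_s / deg_s + u_t / deg_t


def potts_pairwise(x_s, x_t, tau=1.0):
    """Potts pairwise term, used here only as an example."""
    return tau if x_s != x_t else 0.0
```

Summing the adjusted potentials over all cliques containing a node recovers that node's full unary potential, since each of its adjacent nodes contributes one share.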

Producing an adjusted image having pixel labels determined from adjusted messages may comprise producing a new image having pixel labels determined from adjusted messages. Alternatively, producing an adjusted image may comprise altering the pixel labels of the received input image.

The methods described herein work well if the unary potentials (data terms) are reasonably discriminative for most pixels. In cases where larger regions in the image are non-discriminative (e.g. uniformly coloured sky regions in a stereo image pair for dense depth computation), the output of the algorithm can look “patchy”. In order to avoid this, the algorithm may be run on lower resolution versions of the input image(s), followed by upscaling the obtained result. This result can be used to initialize one state in the resident set at the finer level. This scheme can be applied recursively, i.e. the algorithm can be run on very coarse images and subsequently rerun at higher resolutions with the upscaled result from the previous level used as initializer.
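The recursive coarse-to-fine scheme may be sketched as follows. The inference step itself is passed in as a callable solve(image, init), which is a hypothetical interface; subsampling by a factor of two and nearest-neighbour upscaling are assumptions chosen for brevity. For labels that encode displacements, such as disparities, the label values would also need rescaling between levels, which this sketch omits.

```python
def downsample(image):
    """Halve the resolution by keeping every second row and column."""
    return [row[::2] for row in image[::2]]


def upsample(labels, height, width):
    """Nearest-neighbour upscaling of a label map to the given size."""
    last_y, last_x = len(labels) - 1, len(labels[0]) - 1
    return [[labels[min(y // 2, last_y)][min(x // 2, last_x)]
             for x in range(width)]
            for y in range(height)]


def coarse_to_fine(image, solve, levels):
    """Run `solve` from the coarsest level up, seeding each level with the
    upscaled result of the previous one (init is None at the coarsest level)."""
    if levels == 0 or min(len(image), len(image[0])) < 2:
        return solve(image, None)
    coarse = coarse_to_fine(downsample(image), solve, levels - 1)
    init = upsample(coarse, len(image), len(image[0]))
    return solve(image, init)
```

Here init would be used by solve to place one state per pixel in the resident set before message passing starts at that level.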

The described methods may be implemented by a computer program. The computer program, which may be in the form of a web application or ‘app’, comprises computer-executable instructions or code arranged to instruct or cause a computer or processor to perform one or more functions of the described methods. The computer program may be provided to an apparatus, such as a computer, on a computer readable medium or computer program product. The computer readable medium or computer program product may comprise non-transitory media such as semiconductor or solid state memory, magnetic tape, a removable computer memory stick or diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disc, and an optical disk, such as a CD-ROM, CD-R/W, DVD or Blu-ray. The computer readable medium or computer program product may comprise a transmission signal or medium for data transmission, for example for downloading the computer program over the Internet.

An apparatus or device such as a computer may be configured to perform one or more functions of the described methods. The apparatus or device may comprise a mobile phone, tablet, laptop or other processing device. The apparatus or device may take the form of a data processing system. The data processing system may be a distributed system. For example, the data processing system may be distributed across a network or through dedicated local connections.

The apparatus or device typically comprises at least one memory for storing the computer-executable instructions and at least one processor for performing the computer-executable instructions.

While certain arrangements have been described, these arrangements have been presented by way of example only, and are not intended to limit the scope of the inventions. Indeed, the methods, apparatuses and systems described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the apparatuses described herein may be made.

Examples are set out below in the following list of numbered clauses.

1. A computer implemented method for producing an adjusted image using message passing, the method comprising performing the following steps:

-   i) receiving an input image that comprises a plurality of pixels, each pixel having an input value and each input value corresponding to one of a number of possible pixel labels;
-   ii) associating each pixel with a node of a graph and one or more cliques of nodes, wherein each possible pixel label is associated with a unary potential, and wherein, for each permutation of possible pixel labels of the pixels associated with the nodes of each clique, a clique potential is associated;
-   iii) for each clique, initialising at least one message between each pair of nodes of that clique;
-   iv) for each node:
    -   a) determining, for that node, a respective set of possible pixel labels for which the unary potential is known;
    -   b) computing a unary potential of a selected possible pixel label for which the unary potential is unknown and updating the respective set of possible pixel labels to include the selected possible pixel label;
    -   c) for each clique to which that node belongs, based on the unary potentials associated with the updated respective set of possible pixel labels, adjusting the clique potential associated with that clique; and
    -   d) for each clique to which that node belongs, adjusting, based on the adjusted clique potential associated with that clique, at least one of the messages between that node and the other nodes of that clique;
-   v) repeating step iv) until a convergence criterion is met; and
-   vi) producing an adjusted image having pixel labels determined from the adjusted messages.

2. A method according to clause 1, wherein step iv) further comprises, after step a) and before step c), updating the respective set of possible pixel labels by removing one of the possible pixel labels therefrom.

3. A method according to clause 2, wherein for each clique to which that node belongs there is an associated parameter related to messages passed between that node and other nodes of that clique, the parameter dependent on the possible pixel labels for that node; and wherein the one of the possible pixel labels that is removed from the respective set of possible pixel labels is the possible pixel label of the respective set of possible pixel labels for which the sum of the associated parameters is greatest.

4. A method according to any preceding clause, wherein, at step c), adjusting the clique potential associated with that clique comprises adjusting the clique potential based on a weighted sum of the unary potentials associated with the respective set of possible pixel labels.

5. A method according to clause 4, wherein, at step c), adjusting the clique potential associated with that clique further comprises adding a constant value to the weighted sum for each possible pixel label which is not included in the respective set of possible pixel labels.

6. A method according to any preceding clause, wherein, at step b), the selected possible pixel label is selected from a candidate set of possible pixel labels, wherein the candidate set of possible pixel labels is based on an estimated set of possible pixel labels of an adjacent node for which the unary potential is known.

7. A method according to any preceding clause, wherein repeating step iv) until a convergence criterion is met comprises repeating step iv) a predetermined number of times.

8. A method according to any of clauses 1 to 6, wherein repeating step iv) until a convergence criterion is met comprises repeating step iv) until repetition of step iv) does not change the messages.

9. A method according to any preceding clause, wherein at step b), computing a unary potential of a selected possible pixel label comprises estimating a unary potential of the selected possible pixel label based on a unary potential associated with a possible pixel label of an adjacent node in the graph.

10. A method according to any preceding clause, wherein, at step d), adjusting at least one of the messages comprises adjusting the at least one of the messages according to one or more predetermined message weightings.

11. A method according to any preceding clause, wherein the unary potentials are configured so as to cause the method to perform one or more of the following operations upon the input image:

-   segmentation;
-   distortion correction;
-   optical flow;
-   dense stereo correspondence;
-   image denoising, deblurring and zooming.

12. A method according to any preceding clause, wherein the clique potentials are initially configured so as to cause the method to perform one or more of the following operations upon the input image:

-   segmentation;
-   distortion correction;
-   optical flow;
-   dense stereo correspondence;
-   image denoising, deblurring and zooming.

13. An apparatus or system arranged to perform the method of any preceding clause.

14. The apparatus or system of clause 13, wherein the apparatus or system comprises:

-   input/output means arranged to receive an image;
-   a memory storing instructions for causing a processor to perform steps of the method of any of clauses 1 to 12; and
-   a processor for performing steps of the method of any of clauses 1 to 12.

15. A non-transitory computer-readable medium comprising machine-readable instructions arranged, upon execution by one or more processors, to cause the one or more processors to carry out the method of any of clauses 1 to 12.

1. A computer implemented method for producing an adjusted image using message passing, the method comprising performing the following steps: i) receiving an input image that comprises a plurality of pixels, each pixel having an input value and each input value corresponding to one of a number of possible pixel labels; ii) associating each pixel with a node of a graph and one or more cliques of nodes, wherein each possible pixel label is associated with a unary potential, and wherein, for each permutation of possible pixel labels of the pixels associated with the nodes of each clique, a clique potential is associated; iii) for each clique, initialising at least one message between each pair of nodes of that clique; iv) for each node: a) determining, for that node, a respective set of possible pixel labels for which the unary potential is known; b) computing a unary potential of a selected possible pixel label for which the unary potential is unknown and updating the respective set of possible pixel labels to include the selected possible pixel label; c) for each clique to which that node belongs, based on the unary potentials associated with the updated respective set of possible pixel labels, adjusting the clique potential associated with that clique; and d) for each clique to which that node belongs, adjusting, based on the adjusted clique potential associated with that clique, at least one of the messages between that node and the other nodes of that clique; v) repeating step iv) until a convergence criterion is met; and vi) producing an adjusted image having pixel labels determined from the adjusted messages.
2. A method according to claim 1, wherein step iv) further comprises, after step a) and before step c), updating the respective set of possible pixel labels by removing one of the possible pixel labels therefrom.
3. A method according to claim 2, wherein for each clique to which that node belongs there is an associated parameter related to messages passed between that node and other nodes of that clique, the parameter dependent on the possible pixel labels for that node; and wherein the one of the possible pixel labels that is removed from the respective set of possible pixel labels is the possible pixel label of the respective set of possible pixel labels for which the sum of the associated parameters is greatest.
4. A method according to claim 1, wherein, at step c), adjusting the clique potential associated with that clique comprises adjusting the clique potential based on a weighted sum of the unary potentials associated with the respective set of possible pixel labels.
5. A method according to claim 4, wherein, at step c), adjusting the clique potential associated with that clique further comprises adding a constant value to the weighted sum for each possible pixel label which is not included in the respective set of possible pixel labels.
6. A method according to claim 1, wherein, at step b), the selected possible pixel label is selected from a candidate set of possible pixel labels, wherein the candidate set of possible pixel labels is based on an estimated set of possible pixel labels of an adjacent node for which the unary potential is known.
7. A method according to claim 1, wherein repeating step iv) until a convergence criterion is met comprises repeating step iv) a predetermined number of times.
8. A method according to claim 1, wherein repeating step iv) until a convergence criterion is met comprises repeating step iv) until repetition of step iv) does not change the messages.
9. A method according to claim 1, wherein at step b), computing a unary potential of a selected possible pixel label comprises estimating a unary potential of the selected possible pixel label based on a unary potential associated with a possible pixel label of an adjacent node in the graph.
10. A method according to claim 1, wherein, at step d), adjusting at least one of the messages comprises adjusting the at least one of the messages according to one or more predetermined message weightings.
11. A method according to claim 1, wherein the unary potentials are configured so as to cause the method to perform one or more of the following operations upon the input image: segmentation; distortion correction; optical flow; dense stereo correspondence; image denoising, deblurring and zooming.
12. A method according to claim 1, wherein the clique potentials are initially configured so as to cause the method to perform one or more of the following operations upon the input image: segmentation; distortion correction; optical flow; dense stereo correspondence; image denoising, deblurring and zooming.
13. An apparatus or system comprising: input/output means arranged to receive an image; a processor; and a memory storing instructions for causing a processor to perform the following steps: i) receiving an input image that comprises a plurality of pixels, each pixel having an input value and each input value corresponding to one of a number of possible pixel labels; ii) associating each pixel with a node of a graph and one or more cliques of nodes, wherein each possible pixel label is associated with a unary potential, and wherein, for each permutation of possible pixel labels of the pixels associated with the nodes of each clique, a clique potential is associated; iii) for each clique, initialising at least one message between each pair of nodes of that clique; iv) for each node: a) determining, for that node, a respective set of possible pixel labels for which the unary potential is known; b) computing a unary potential of a selected possible pixel label for which the unary potential is unknown and updating the respective set of possible pixel labels to include the selected possible pixel label; c) for each clique to which that node belongs, based on the unary potentials associated with the updated respective set of possible pixel labels, adjusting the clique potential associated with that clique; and d) for each clique to which that node belongs, adjusting, based on the adjusted clique potential associated with that clique, at least one of the messages between that node and the other nodes of that clique; v) repeating step iv) until a convergence criterion is met; and vi) producing an adjusted image having pixel labels determined from the adjusted messages.
14. A non-transitory computer-readable medium comprising machine-readable instructions arranged, upon execution by one or more processors, to cause the one or more processors to perform the following steps: i) receiving an input image that comprises a plurality of pixels, each pixel having an input value and each input value corresponding to one of a number of possible pixel labels; ii) associating each pixel with a node of a graph and one or more cliques of nodes, wherein each possible pixel label is associated with a unary potential, and wherein, for each permutation of possible pixel labels of the pixels associated with the nodes of each clique, a clique potential is associated; iii) for each clique, initialising at least one message between each pair of nodes of that clique; iv) for each node: a) determining, for that node, a respective set of possible pixel labels for which the unary potential is known; b) computing a unary potential of a selected possible pixel label for which the unary potential is unknown and updating the respective set of possible pixel labels to include the selected possible pixel label; c) for each clique to which that node belongs, based on the unary potentials associated with the updated respective set of possible pixel labels, adjusting the clique potential associated with that clique; and d) for each clique to which that node belongs, adjusting, based on the adjusted clique potential associated with that clique, at least one of the messages between that node and the other nodes of that clique; v) repeating step iv) until a convergence criterion is met; and vi) producing an adjusted image having pixel labels determined from the adjusted messages.